Abstract. We present a new statistical method to optimally 
link local weather extremes to large-scale atmospheric circu- 
lation structures. The method is illustrated using July-August 
daily mean temperature at 2m height (T2m) time-series over 
the Netherlands and 500 hPa geopotential height (Z500) 
time-series over the Euroatlantic region of the ECMWF re- 
analysis dataset (ERA40). The method identifies patterns 
in the Z500 time-series that optimally describe, in a pre- 
cise mathematical sense, the relationship with local warm 
extremes in the Netherlands. Two patterns are identified; the 
most important one corresponds to a blocking high pressure 
system leading to subsidence and calm, dry and sunny con- 
ditions over the Netherlands. The second one corresponds 
to a rare, easterly flow regime bringing warm, dry air into 
the region. The patterns are robust; they are also identified 
in shorter subsamples of the total dataset. The method is 
generally applicable and might prove useful in evaluating the 
performance of climate models in simulating local weather 
extremes. 



1 Introduction 

Weather extremes such as extreme wind speeds, extreme 
precipitation or extreme warm or cold conditions are expe- 
rienced locally. They are usually connected to circulation 
structures of much larger scale in the atmosphere. For exam- 
ple, if we restrict ourselves to the Netherlands, a well-known 
circulation structure that often leads to extreme hot summer 
days is a high pressure system that blocks the inflow of cooler 
maritime air masses. Moreover, the subsidence of air in its 
interior leads to clear skies and an abundance of sunshine 
that leads to high temperatures. If the blocking high persists 
and depletes the soil moisture due to lack of precipitation and 
increased evaporation, temperatures tend to soar, as they did in the European summer of 2003 (Schär et al., 2004). Speculations about a positive feedback of dry soil on the persistence of the blocking high can also be found in the literature (Ferranti and Viterbo, 2006).
In order for climate models to correctly simulate the probability of extreme hot summer days, a crucial ingredient is the correct simulation of the probability of the occurrence of blocking. This feature of the atmospheric circulation is well known to be difficult to simulate realistically (Pelly and Hoskins, 2003). Verifying models with respect to this aspect is, in practice, difficult as well, since idealized model experiments suggest a high degree of internal variability of blocking frequencies even on decadal timescales (Liu and Opsteegh, 1995).

In a world with increasing concentrations of greenhouse gases, not only does the temperature increase; the large-scale circulation also adjusts to achieve a new (thermo)dynamical balance. Models disagree on the magnitude and even the direction of this change locally (van Ulden and van Oldenborgh, 2006). For instance, a change in the probability of European
blocking conditions in summer immediately impacts the fu- 
ture probability of European heat waves. This makes prob- 
ability estimates of future European heat waves very uncer- 
tain. To address the questions concerning the probability of 
future extreme weather events, and the evaluation of climate 
model simulations in this respect, it is necessary to have a de- 
scriptive method that links local weather extremes to large- 
scale circulation features. To the best of our knowledge, an 
optimal method to do so does not exist in the literature. 

We identified two approaches in the literature to link local weather extremes to large-scale circulation features. In the first one, the circulation anomalies are classified first, and the connection with local extremes is analyzed afterwards. The "Grosswetterlagen" developed by synoptic meteorologists are one such classification (Kyselý, 2002). Clustering algorithms of various kinds are another example (Plaut and Simonnet, 2001; Cassou et al., 2005). In our opinion, this approach is not optimal since information about the extreme is not taken into account in the definition of the patterns.

Fig. 1. The leading two EOFs for the July and August Z500 daily anomaly field for 43 years of ERA-40 data (1958-2000). Left figure: first EOF; right figure: second EOF. Relative importances are 12.57% and 11.79% respectively. The patterns have been multiplied by one standard deviation of the corresponding amplitude time-series (in meters).

In the second approach, a measure of the local extreme does enter the definition of the large-scale circulation patterns. For instance, only atmospheric states for which the local extreme occurs are considered. Next, a simple averaging operator is applied ["composite method" as in Schaeffer et al. (2005)] or a clustering analysis is performed (Sanchez-Gomez and Terray, 2005). The composite method falls short since by definition it finds only one typical circulation anomaly, whereas we know from synoptic experience that different kinds of circulation anomalies often lead to a similar local weather extreme. The clustering analysis is debatable since the data record is often too short to identify clusters with enough statistical confidence (Hsu and Zwiers, 2001).

The purpose of this paper is to report a new optimal 
method to relate local weather extremes to characteristic 
circulation patterns. This method objectively identifies, in 
a robust manner, the different circulation patterns that fa- 
vor the occurrence of local weather extremes. The method 
is inspired by the Optimal Autocorrelation Functions of Selten et al. (1999). It is based on considering linear combinations of the dominant Empirical Orthogonal Functions 
that maximize a suitable statistical quantity. We illustrate our 
method by analyzing the statistical relation between extreme 
high daily mean temperatures at two meter height (T2m) in 
July and August in the Netherlands and the structure of the 
large-scale circulation as measured by the 500 hPa geopoten- 
tial height field (Z500). 

This paper is divided into five sections. Section 2 is focused on the data, where we explain the method to obtain the daily Z500 and T2m anomalies in Europe, and report the results of the EOF analysis of the Z500 anomaly data. In Sec. 3 we outline the procedure to optimize the quantity that describes the statistical relation between the Z500 and the extreme T2m anomalies, supported by the additional details in the Appendix. In Sec. 4 we identify the large-scale Z500 anomaly patterns that are associated with hot summer days in the Netherlands, demonstrate the robustness of our method and compare the patterns with patterns earlier reported in the literature. Finally, we conclude this report in Sec. 5 with a discussion on the possible applications of our method.



2 The T2m and Z500 datasets, and EOF analysis of the 
Z500 data 

Our data have been obtained from the ERA-40 reanalysis dataset, for the timespan Sept. 1957 to Aug. 2002, at 6-hourly intervals on a 2.5° x 2.5° latitude-longitude grid. These data are publicly available at the ECMWF website http://data.ecmwf.int/data/d/era40_daily/. The T2m data over the whole of Europe, defined by 37.5°N-70°N and 10°W-40°E, and the Z500 data over 20°N-90°N and 60°W-60°E were downloaded. From these, the daily averages of the T2m and Z500 fields for the years 1958-2000 (43 years in total) were computed. This formed our full raw dataset.

In order to remove possible effects of global warming in the last decades of the 20th century, these fields should in principle be detrended prior to further calculations. However, an analysis of the Z500 daily averaged field revealed 
no significant linear trend over these 43 years. Therefore the 
Z500 daily anomaly field was obtained by simply removing 
the seasonal cycle defined by an average over the entire pe- 
riod of 43 years. Greatbatch and Rong (2006) showed that 
over Europe, the trends in the ERA-40 reanalysis and NCEP- 
NCAR reanalysis are indeed small and similar. 

A warming trend, however, is clearly present in the T2m field. For detrending the T2m field, the monthly averages for July and August were calculated from the daily averages at each gridpoint. Next, 11-year running means were computed for these monthly averaged T2m fields (for July and August separately), and these formed our baseline for calculating the daily T2m anomaly field. This procedure does not yield the baseline for the first and the last 5 years (1958-1962 and 1996-2000); these were computed by extrapolating the baseline trend for the years 1963-1964 and 1995-1996 respectively.
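
The baseline construction can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `running_mean_baseline` and the `fit_years` parameter are hypothetical, and the edge years are extended linearly from the two nearest interior baseline values, which is one plausible reading of the extrapolation described above.

```python
def running_mean_baseline(monthly_means, window=11, fit_years=2):
    """Centered running mean with linear extrapolation at both ends.

    monthly_means: one value per year (e.g. the July mean T2m at one gridpoint).
    fit_years:     number of interior baseline values (>= 2) used to estimate
                   the end slopes (an assumption made for this sketch).
    """
    n = len(monthly_means)
    half = window // 2
    if fit_years < 2 or n < window + fit_years - 1:
        raise ValueError("series too short or fit_years < 2")
    baseline = [None] * n
    # Interior years: plain centered running mean.
    for i in range(half, n - half):
        baseline[i] = sum(monthly_means[i - half:i + half + 1]) / window
    # Edge years: extend the baseline linearly, using the slope between the
    # first (last) `fit_years` interior baseline values.
    first, last = half, n - half - 1
    slope_start = (baseline[first + fit_years - 1] - baseline[first]) / (fit_years - 1)
    slope_end = (baseline[last] - baseline[last - fit_years + 1]) / (fit_years - 1)
    for i in range(first):
        baseline[i] = baseline[first] + slope_start * (i - first)
    for i in range(last + 1, n):
        baseline[i] = baseline[last] + slope_end * (i - last)
    return baseline
```

With 43 yearly values and an 11-year window, the interior part covers indices 5 to 37, leaving five years at each end to be extrapolated. A daily T2m anomaly would then be the daily value minus the baseline value for that year and month.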

For the EOF analysis of the Z500 anomaly field, note that most of the variance of atmospheric variability resides in the low-frequency part [10-90 day range, Malone et al. (1984)]. Indeed, the dominant EOFs of Z500 anomaly fields proved insensitive to the application of 3-day, 5-day, 7-day, 9-day and 15-day running mean filters. For the sake of simplicity, therefore, we decided to only consider EOFs based on daily Z500 anomaly fields. The EOF analysis was performed on the regular lat-lon grid data with each grid point weighted by the cosine of its latitude to account for the different sizes of the grid cells. Using these weights, the EOFs $\mathbf{e}_k$ are orthogonal in space (note here that we use the same definition of vector dot product in space all throughout this paper)



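$$\mathbf{e}_k \cdot \mathbf{e}_l \equiv \frac{\sum_{i=1}^{N} e_k(\lambda_i,\phi_i)\, e_l(\lambda_i,\phi_i)\cos(\phi_i)}{\sum_{i=1}^{N}\cos(\phi_i)} = \delta_{kl}, \qquad (1)$$

where $\phi$ denotes latitude, $\lambda$ longitude and N the total number of grid points, and $\delta_{kl}$ is the Kronecker delta. Each Z500 anomaly field can be expressed in terms of the EOFs as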



$$\mathrm{Z500}(t) = \sum_{k=1}^{N} a_k(t)\,\mathbf{e}_k, \qquad (2)$$

where the amplitudes $a_k$ are found by a projection of the Z500 anomaly onto the EOFs

$$a_k(t) = \mathrm{Z500}(t)\cdot\mathbf{e}_k. \qquad (3)$$

A nice property of the EOFs is that their amplitude time-series are uncorrelated in time at zero lag

$$\langle a_k(t)\,a_l(t)\rangle = \sigma_k^2\,\delta_{kl}, \qquad (4)$$

where the angular brackets $\langle\cdot\rangle$ denote a time average and $\sigma_k^2$ denotes the eigenvalue of the k-th EOF, which is equal to the variance of the corresponding amplitude time-series.
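
A minimal NumPy sketch of such an area-weighted EOF analysis is given below. It is an illustration rather than the code used for this study: the function name `weighted_eofs` is hypothetical, the decomposition is done via an SVD of the weighted anomaly matrix (one common route to EOFs), and grid points exactly at the pole, where the cosine weight vanishes, are assumed to have been removed beforehand.

```python
import numpy as np

def weighted_eofs(anom, lat_deg):
    """anom: (ntime, npoints) anomaly matrix; lat_deg: (npoints,) latitudes.

    Returns EOFs e_k (rows), amplitudes a_k(t) and eigenvalues sigma_k^2,
    using the area-weighted dot product x.y = sum_i w_i x_i y_i with
    w_i = cos(phi_i) / sum_j cos(phi_j).
    Assumes no grid point sits exactly at the pole (w_i > 0 everywhere).
    """
    w = np.cos(np.deg2rad(lat_deg))
    w = w / w.sum()
    sw = np.sqrt(w)
    # Scaling by sqrt(w) turns the weighted problem into an ordinary SVD.
    u, s, vt = np.linalg.svd(anom * sw, full_matrices=False)
    eofs = vt / sw                    # e_k, orthonormal under the weighted dot product
    amps = (anom * w) @ eofs.T        # a_k(t) = Z500(t) . e_k
    eigvals = s**2 / anom.shape[0]    # time-mean square of each amplitude series
    return eofs, amps, eigvals
```

For anomalies with zero time mean, `eigvals` holds the variances of the amplitude series, and the amplitude covariance matrix `amps.T @ amps / ntime` should come out diagonal, mirroring Eq. (4).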

We found that July and August months produced very sim- 
ilar EOFs, while June and September EOFs were signifi- 
cantly different. We therefore decided to restrict the summer 
months to July and August. The leading two EOFs for the corresponding daily Z500 anomaly for 1958-2000 are shown in Fig. 1. The values correspond to one standard deviation of the corresponding amplitude. The two EOFs are not well separated (the eigenvalues are close together) and therefore we expect some mixing between the two patterns (North et al., 1982). A linear combination of the two EOFs shifts the longitudinal position of the strong anomaly over Southern Scandinavia which is present in the first EOF. It resembles the summer NAO pattern as diagnosed by Greatbatch and Rong (2006) (their figure 8).



3 Optimization procedure to establish the connection 
between Z500 anomalies and local extreme T2m 

One of the first approaches we considered to establish the 
connection between Z500 daily anomaly fields and extreme 
daily T2m is the so-called "clustering method", which iden- 
tifies clusters of points in the vector space spanned by the 
dominant EOFs. The daily Z500 anomaly field for July and August over 43 years yields precisely 2666 datapoints in this vector space. A projection of these daily anomalies on the two-dimensional vector space of the two leading EOFs (EOF1 and EOF2) is shown in Fig. 2. No clear clusters are apparent by simple visual inspection. Using existing cluster algorithms to identify clusters of points that correspond to specific large-scale circulation patterns occurring significantly more frequently than others is therefore not a trivial undertaking. Often it turns out that, using 40 years of data or so, the clusters identified are the result of sampling errors due to too few data points [Hsu and Zwiers (2001); (2007); Stephenson and O'Neill (2004)].

Fig. 2. Projection of the daily Z500 anomaly field for July and August months for 43 years in the two-dimensional vector space spanned by the two leading EOFs.

Nevertheless, when we plot the T2m positive anomaly val- 
ues at the center of the Netherlands (52.5°N, 5°E) in a scatter 
plot with the amplitude of EOF1, a distinct "tilt" in the scat- 
ter plot emerges: i.e., with increasing amplitude of the lead- 
ing EOF, the likelihood of having very hot summer days in- 
creases. Having inspected the same plots for the other EOFs 
we found a similar tilt for some of the other EOFs as well. 
From this point of view, finding the statistical relationship 
between T2m at a given place and the state of the large-scale 
atmospheric circulation can be reduced to a mathematical exercise that finds those linear combinations of EOFs that optimally bring out this tilt. In the remainder of this section, supported by the Appendix, we present a general, rigorous and robust procedure to achieve this.

Fig. 3. Scatter plot of T2m > 0 at the center of the Netherlands vs. the amplitude of the leading Z500 EOF (EOF1). With increasing amplitude of the leading EOF, the likelihood of having very hot summer days increases.

To represent this statistical relationship, we start by defining the following dimensionless quantity

$$r_k^{(L)} \equiv \frac{\langle b_k^{(L)}(t)\,[T(t)]^n\rangle_p}{\sqrt{\langle [b_k^{(L)}(t)]^2\rangle_p}\;\sqrt{\langle [T(t)]^{2n}\rangle_p}}. \qquad (5)$$

Here the angular brackets $\langle\cdot\rangle_p$ denote a time average taken only over those days for which T2m(t) > 0, and n is a positive number > 1. The idea behind choosing n > 1 is that days with higher T(t) then give a larger contribution to $r_k^{(L)}$. Since we are interested in high-temperature days at gridpoint G, we choose n = 2 for this study. The variable $b_k^{(L)}(t)$ is the amplitude on day t of a pattern, defined as a linear combination of the first L EOFs. Since L linear combinations can be defined that form a new complete basis in the subspace of the first L EOFs, we use the subscript k to denote these different linear combinations.

We first concentrate on the calculation of the first pattern. Using $c_{j1}^{(L)}$ to denote the coefficients of this first linear combination, we have

$$b_1^{(L)}(t) = \sum_{j=1}^{L} c_{j1}^{(L)}\,a_j(t). \qquad (6)$$

Notice that since the time averages are taken only over those days for which T2m(t) > 0, $\langle b_1^{(L)}(t)\rangle_p \neq 0$, although $\langle b_1^{(L)}(t)\rangle = 0$ since $\langle a_j(t)\rangle = 0$.

Equations (5)-(6) imply that, given the time-series of T2m and Z500 anomalies, the numerical value of $r_1^{(L)}$ depends only on L and on the coefficients $c_{j1}^{(L)}$. For a given value of L, the coefficients $c_{j1}^{(L)}$ are found by maximizing the square of $r_1^{(L)}$ within the vector space of the first L EOFs (the square is taken since $r_1^{(L)}$ can take on negative values as well).

If we define for T(t) > 0,

$$T_p(t) = \frac{[T(t)]^n}{\sqrt{\langle [T(t)]^{2n}\rangle_p}}, \qquad (7)$$

then Eq. (5) can be rewritten for k = 1 as

$$r_1^{(L)} = \frac{\langle b_1^{(L)}(t)\,T_p(t)\rangle_p}{\sqrt{\langle [b_1^{(L)}(t)]^2\rangle_p}}. \qquad (8)$$

In words, maximizing $r_1^{(L)}$ defines a pattern that, for a change of one standard deviation in its amplitude $b_1^{(L)}$, brings about the largest change in the normalized positive temperature anomaly $T_p$; put differently, the local temperature responds most sensitively to changes in the normalized amplitude of this pattern. In this sense, the pattern is optimally linked to the local warm temperature extremes.
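
To make the definition concrete, the following short Python sketch evaluates the quantity of Eq. (5) for a single amplitude series. The function name `r_value` and its arguments are hypothetical; the sketch simply assumes that `b` and `temp` are equally long daily series and that only days with a positive temperature anomaly enter the averages.

```python
import math

def r_value(b, temp, n=2):
    """Statistic of Eq. (5) for one amplitude series.

    b:    pattern amplitudes b(t), one per day
    temp: T2m anomalies T(t) on the same days
    Only days with temp > 0 enter the averages (the <.>_p brackets).
    """
    pairs = [(bi, ti ** n) for bi, ti in zip(b, temp) if ti > 0]
    if not pairs:
        raise ValueError("no days with a positive temperature anomaly")
    m = len(pairs)
    num = sum(bi * tn for bi, tn in pairs) / m      # <b T^n>_p
    b_sq = sum(bi * bi for bi, _ in pairs) / m      # <b^2>_p
    t_sq = sum(tn * tn for _, tn in pairs) / m      # <T^(2n)>_p
    return num / math.sqrt(b_sq * t_sq)
```

Because of the normalization, the value is unchanged when `b` is rescaled by a positive factor, and by the Cauchy-Schwarz inequality it lies between -1 and 1.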

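It is shown in the appendix that maximizing $r_1^{(L)}$ corresponds to the linear least squares fit of the EOF amplitude timeseries to $T_p(t)$

$$T_p(t) \simeq \sum_{j=1}^{L} c_{j1}^{(L)}\,a_j(t), \qquad (9)$$

with the coefficients $c_{j1}^{(L)}$ given by

$$c_{j1}^{(L)} = \sum_{i=1}^{L}\left[\langle a_j(t)\,a_i(t)\rangle_p\right]^{-1}\langle T_p(t)\,a_i(t)\rangle_p, \qquad (10)$$

where $[\langle a_j(t)\,a_i(t)\rangle_p]^{-1}$ denotes the (j, i)-th element of the inverse of the covariance matrix of the EOF amplitudes.

This result makes sense since the linear least squares fit optimally combines the EOF amplitude timeseries to minimize the mean squared error between the actual temperature anomaly and the temperature anomaly estimated from the circulation anomaly on that day.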
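
The link with least squares can be checked numerically. The sketch below is a hedged illustration, not the original implementation: the function name `first_eaf_coefficients` and the parameter `n_power` are hypothetical, the fit is done without an intercept over the warm days only, and `numpy.linalg.lstsq` stands in for the explicit matrix inverse of Eq. (10).

```python
import numpy as np

def first_eaf_coefficients(amps, temp, n_power=2):
    """amps: (ntime, L) EOF amplitudes a_j(t); temp: (ntime,) T2m anomalies.

    Returns (c1, r1): least-squares coefficients c_{j1} and the resulting r value.
    """
    warm = temp > 0                              # days entering the <.>_p averages
    a_p = amps[warm]
    t_p = temp[warm] ** n_power
    t_p = t_p / np.sqrt(np.mean(t_p ** 2))       # normalized T_p(t), Eq. (7)
    # Least-squares fit of T_p(t) on the amplitudes, without intercept.
    c1, *_ = np.linalg.lstsq(a_p, t_p, rcond=None)
    b1 = a_p @ c1                                # amplitude of the fitted pattern
    r1 = np.mean(b1 * t_p) / np.sqrt(np.mean(b1 ** 2))   # Eq. (8)
    return c1, r1
```

Any other coefficient vector, for instance a randomly perturbed copy of `c1`, should yield an r value no larger than `r1`, which is a quick sanity check of the maximization property.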

The procedure to find the remaining (L - 1) linear combinations is as follows. We first reduce the Z500 anomaly fields to the (L - 1)-dimensional subspace $\mathrm{Z500}'^{(L-1)}$ that is orthogonal to the first linear combination. In this subspace we again determine the linear combination that optimizes $r_2^{(L)}$. By construction, this value is lower than $r_1^{(L)}$. This procedure is repeated to determine all L linear combinations with decreasing order of optimized values $r_k^{(L)}$.

There is no unique way to define the subspaces, and how this is done affects the properties of the linear combinations. The linear combinations can either (a) be constructed to form an orthonormal basis in space, in which case their amplitudes are temporally correlated; or (b) be constructed so that the corresponding amplitudes are temporally uncorrelated, in which case they are not orthonormal in space. In both cases, they form a complete basis in the space of the first L EOFs

$$\mathrm{Z500}^{(L)}(t) = \sum_{k=1}^{L} b_k^{(L)}(t)\,\mathbf{f}_k^{(L)}, \qquad (11)$$

where $\mathrm{Z500}^{(L)}(t)$ is the Z500 anomaly field truncated to the first L EOFs.

We will call the patterns $\mathbf{f}_k^{(L)}$ Extreme Associated Functions (EAFs). The mathematical details on how to obtain $b_k^{(L)}(t)$ for both options can be found in the Appendix.

Fig. 4. The behavior of r as a function of L for the first (red) and second (blue) EAFs. Left panel: patterns are orthogonal in space, but are correlated in time [option (a) in text]; right panel: patterns are uncorrelated in time but are not orthogonal in space [option (b) in text].



4 Statistical relationship between high summer temper- 
ature in the Netherlands and large-scale atmospheric 
circulation structures 

We now need a criterion to determine the optimal number of EOFs in the linear combinations. The reason for limiting the number of EOFs in the linear combinations is apparent from Eq. (10). Here the inverse of the covariance matrix of the EOF amplitudes appears. This matrix becomes close to singular when low-variance EOFs are included in the linear combination. This makes the solution for the coefficients $c_{jk}^{(L)}$ ill-determined [see the general linear least squares section in Press et al. (1986) for a detailed discussion on this issue]. Typically what is observed is that the inclusion of many more low-variance EOFs only marginally improves the $r_k^{(L)}$ values, but that the corresponding patterns describe less variance and become "noisier", i.e. project onto Z500 variations at progressively smaller wavelengths. The optimal value of L in a statistical procedure like this, denoted by $L_c$, is subjective, but can nevertheless be found from a tradeoff between the amount of variance that the patterns describe and their r-values.



The procedure to determine $L_c$ for the daily summer (July and August) temperature in the Netherlands [represented by T2m at (52.5°N, 5°E)] and the Z500 daily anomaly field over the region 20°N-90°N and 60°W-60°E for 43 years (1958-2000) is as follows. As can be expected, both $r_1^{(L)}$ and $r_2^{(L)}$ are increasing functions of L [Fig. 4 (left)] and the variance associated with the corresponding EAFs tends to decrease with increasing L (not shown here). For option (a), both $r_1^{(L)}$ and $r_2^{(L)}$ improve significantly when including EOF12 in the linear combination; at the same time the variance of EAF1 decreases and the variance of EAF2 increases. The corresponding patterns also change markedly. Between L = 12 and L = 15 the patterns, r-values and variances remain relatively unchanged. Beyond L = 15 the r-values steadily increase, the variance decreases and the patterns become "noisier". Simultaneously, the temporal correlation between the dominant two EAF patterns steadily increases with L. For large L, as Fig. 4 (left) shows, both $r_1^{(L)}$ and $r_2^{(L)}$ saturate to values very close to each other, and the solution tends to become degenerate. Our interpretation of this is that the information that is contained in the Z500 anomaly fields about the local temperatures in the Netherlands is shared among increasingly more patterns, which is an undesirable characteristic. For example, for L = 12, the temporal correlation between EAF1 and EAF2 is 0.58, for L = 50 it is 0.93. Based on these findings, we consider $L_c$ to be equal to 12.

Similar graphs for EAFs calculated following option (b) are displayed in Fig. 4 (right). By construction, the value of $r_1^{(L)}$ is the same. In this case, the variance decreases as well with increasing L, but much less so. The corresponding patterns are quite stable beyond L = 19. It is only beyond L = 200 or so that the second EAF more and more resembles the first EAF; for L = 19 the spatial correlation between EAF1 and EAF2 is only 0.2 (they are almost orthogonal), for L = 200 it is 0.4 and for L = 500 it is 0.8. By construction, the temporal correlation between EAF1 and EAF2 remains zero. In this case, the choice of L is not so critical and we simply choose $L_c = 50$.

Fig. 5. The leading two Z500 daily anomaly patterns (EAFs) that are associated with warm July and August daily temperatures in the Netherlands: EAFs orthogonal in space, corresponding to $L_c = 12$ (top panel); EAF amplitudes uncorrelated in time, corresponding to $L_c = 50$ (bottom panel). The first EAFs are shown on the left, and the second EAFs are shown on the right. All patterns have been multiplied by one standard deviation of the corresponding amplitude time-series (in meters).

The results for the spatially orthogonal EAFs corresponding to $L_c = 12$ and those for EAFs uncorrelated in time corresponding to $L_c = 50$ are shown in Fig. 5. The first EAFs obtained from options (a) and (b) are very similar; the differences in the second are bigger. The first corresponds to a high pressure system, leading to clear skies over the Netherlands, an abundance of sunshine and a warm southeasterly flow. In addition to this circulation anomaly, the method finds another pattern that occurs less often; EAF2 corresponds to an easterly flow regime bringing warm dry continental air masses to the Netherlands. Option (b) gives a more localized Z500 anomaly pattern, with a warm, easterly flow into the Netherlands. Option (a) also captures the warm, easterly flow, but is less localized and is less well defined as a function of L. The $r_2^{(L)}$ value is larger for option (a), but the amplitude of EAF2 is then temporally correlated with that of the first EAF. This implies that part of the information about the local warm temperatures in the Netherlands that is contained in the amplitude timeseries of EAF2 is already captured by EAF1; they are not independent. The $r_2^{(L)}$ value is smaller for option (b), but at least the information it contains about the local warm temperatures in the Netherlands is independent from EAF1. Given these considerations, we conclude that option (b), constructing EAFs that are temporally uncorrelated, is the best option.



The scatterplots of $b_1^{(L_c=50)}$ and $b_2^{(L_c=50)}$ against the positive temperature anomalies in the Netherlands for EAF1 and EAF2 that are uncorrelated in time are shown in Fig. 6. Compared to the EOF with the largest r value (EOF1, see Fig. 3), the relationship of $b_1^{(L_c=50)}$ to temperature is much stronger. The r value of the first EAF is almost a factor of 2 larger. The main contribution to the first EAF is from the first EOF, but EOFs 3, 4 and 6 also contribute substantially. Only two EAFs are found with a clear connection (i.e., a tilt in the scatterplot) to warm extremes in the Netherlands. This information was spread mainly between EOFs 1, 3, 4 and 6. Regressing Z500 anomalies upon the temperature time-series in De Bilt gives a pattern that resembles EAF1 (Fig. 7). Also a simple compositing (averaging the 5 percent hottest days) yields a pattern very similar to EAF1 (Fig. 7). In addition to this, the EAF method is able to identify another, less dominant, flow configuration that leads to warm weather in the Netherlands through advection of warm airmasses from eastern Europe. Comparing EAF1 to the clusters of summer Z500 anomalies published in Cassou et al. (2005), we note that EAF1 is a combination of their 'blocking' and 'Atlantic low' regimes that favour warm conditions in all of France and Belgium (temperatures in the Netherlands were not analyzed). The easterly flow regime is not present in their clusters.

Fig. 7. Daily Z500 anomaly field regressed on daily mean temperatures in the Netherlands in July and August in meters/Kelvin (left). Composite Z500 daily anomaly field for the 5% warmest days in July and August in the Netherlands over the period 1958-2000 in meters (right).

In order to check that this method to identify the relevant 
large-scale atmospheric circulation patterns for warm days in 
the Netherlands is robust, we have also performed the same 
analysis for the first 21 years (1958-1978) of daily summer- 



time data and the last 21 years (1980-2000). In both cases we 
found very similar EAF1 patterns and corresponding scatter 
plots as for the full period. EAF2 however is only recovered 
in the second period. One interpretation of this is that EAF2 
is less frequently present in the first period. As argued by Liu and Opsteegh (1995), this variation could be entirely due 
to the chaotic nature of the atmospheric circulation and need 
not be caused by a factor external to the atmosphere (as for 
instance increasing levels of greenhouse gases, changes in 
sea surface temperatures or solar activity to name a few). 

Instead of taking all positive temperature anomalies, a threshold could be introduced to analyze only the more extreme warm days. However, limiting the analysis to the 30% warmest positive temperature anomalies did not qualitatively change the first two EAFs. Also, varying the value of the power applied to the temperature anomaly from 1 to 3 modified the resulting EAFs only quantitatively, not qualitatively. A final test of robustness was to limit the analysis to a smaller domain. Again we found the same two EAF patterns on a much smaller domain from 20 degrees east to 32.5 degrees west and 35 to 70 degrees north. The method thus produces robust patterns.

5 Discussion: Applicability of the Extreme Associated Functions

The Extreme Associated Function method developed in this study to establish the connection between local weather extremes and large-scale atmospheric circulation structures has several potentially useful applications.

First of all, since this method proved to satisfy several tests of rigor and robustness for the temperature extremes in the Netherlands, it can be applied to local temperature extremes at any other place, or for that matter to other forms of extreme local weather conditions as well, like precipitation or wind. In this sense the method is quite general.

EAFs can be used to evaluate the performance of climate models with respect to the occurrence of local weather extremes. The EAF method helps to answer, in an objective manner, the question whether the climate model is able to generate the same patterns that are found in nature to be responsible for local weather extremes, with a similar probability of occurrence. In addition, to evaluate the impact of climate change on local weather extremes, the EAF method helps to answer the question whether the probability of certain local weather extremes changes in future scenario simulations due to a change in the probability of occurrence of the EAFs.

It might be found that some climate models are able to simulate the EAFs, but do not reproduce the local extremes well. Lenderink et al. (2007), for instance, found that regional climate models forced with the right large-scale circulation structures at the domain boundaries nevertheless tended to overestimate the summer temperature variability in Europe due to deficiencies in the description of the hydrological cycle. The EAFs can be used to correct the model output for this discrepancy by applying the observed statistical relationship between the EAFs and the local extremes to the model generated EAFs.

By choosing the particular form of r in Eq. (5) as the quantity to be optimized, the EAF method turns out to be equivalent to multiple linear regression. Other measures to describe the statistical relationship between circulation and temperature present in the scatterplot of Fig. 3 could be designed that would make the EAF method different from a multiple linear regression technique. In this sense, the EAF method is more general and potentially can be improved by designing a more apt measure.

funded by the European Commission's 6th Framework Programme through contract GOCE-CT-2003-505539.



Appendix Calculation of the EAFs by a repetitive maximization procedure

Since the entire appendix describes the procedure to calculate $b_k^{(L)}$, i.e., the $b_k$-values for a given L, we drop all superscripts involving L for the sake of notational simplicity.

A.1. Calculation of the first EAF

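To calculate which set of coefficients $c_{j1}$ maximizes the value of $r_1$ as expressed in Eq. (8), we take the variation of $r_1^2$ w.r.t. variations $\delta c_{j1}$ and, using Eq. (6), obtain

$$\delta\left[r_1^2\right] = \frac{2}{\langle b_1^2(t)\rangle_p}\sum_{l=1}^{L}\Big\{\langle T_p(t)\,a_k(t)\rangle_p\,\langle T_p(t)\,a_l(t)\rangle_p - r_1^2\,\langle a_k(t)\,a_l(t)\rangle_p\Big\}\,c_{l1}\,\delta c_{k1} \quad \text{for } k = 1,\dots,L. \qquad (A.1.1)$$

This means that with the l.h.s. of Eq. (A.1.1) set to zero at the maximum of $r_1$, where $r_1 = r_1^{\max}$, for any choice of $\delta c_{k1}$, we obtain a generalized eigenvalue equation: if we denote $\langle T_p(t)\,a_k(t)\rangle_p\,\langle T_p(t)\,a_l(t)\rangle_p$ by $\tilde a_k\tilde a_l$ and $\langle a_k(t)\,a_l(t)\rangle_p$ by $v_{kl}$, then Eq. (A.1.1) leads to

$$\sum_{l=1}^{L}\left\{\tilde a_k\tilde a_l - [r_1^{\max}]^2\,v_{kl}\right\}c_{l1} = 0. \qquad (A.1.2)$$

Equation (A.1.2) can be written as a matrix equation

$$\mathbf{A}\,\mathbf{c}_1 = [r_1^{\max}]^2\,\mathbf{V}\,\mathbf{c}_1, \qquad (A.1.3)$$

where the (k, l)-th elements of matrices $\mathbf{A}$ and $\mathbf{V}$ are given by $\tilde a_k\tilde a_l$ and $v_{kl}$ respectively, and the l-th element of the column vector $\mathbf{c}_1$ is given by $c_{l1}$. Note that in Eq. (A.1.2) $v_{kl} \neq \sigma_k^2\,\delta_{kl}$, since the time-average is defined only over the days for which T2m(t) > 0. Since matrix $\mathbf{A}$ is a tensor product of two column vectors, $\mathbf{A} = \tilde{\mathbf{a}}\tilde{\mathbf{a}}^T$, where the superscript "T" indicates transpose, the matrix equation (A.1.3) has only one eigenvector with non-zero eigenvalue, given by

$$\mathbf{c}_1 \propto \mathbf{V}^{-1}\tilde{\mathbf{a}}, \qquad (A.1.4)$$

or equivalently

$$c_{j1} \propto \sum_{i=1}^{L}\left[\langle a_j(t)\,a_i(t)\rangle_p\right]^{-1}\langle T_p(t)\,a_i(t)\rangle_p. \qquad (A.1.5)$$

The corresponding optimized value $r_1$ is determined from Eq. (A.1.3).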

The equivalence between the maximization of $r_1$ and the multiple linear regression of $T_p(t)$ on the timeseries of the EOF amplitudes $a_j(t)$ [see Eq. (9)] is apparent by noticing that the above solution for $\mathbf{c}_1$ is the same as the solution of the multiple linear regression problem given in Eq. (10).

How the EAFs are determined from the coefficients $c_{jk}$ is shown in the next section, in which we explain the calculation of the remaining (L - 1) EAFs.

A.2. Calculation of the remaining (L — 1) EAFs 

As explained in the text, the calculation of the remaining 
(L — 1) linear combinations requires a choice between two 
options, (a) The patterns are orthogonal in space, or, (b) the 
amplitude timeseries are uncorrelated in time. We will show 
the implementation of both options. 
We first discuss option (a). 

Combining the expansion of Z500(t) into EOFs as in Eq. (2) and into EAFs as in Eq. (11) gives the following relation between the EOFs and EAFs

$$\mathbf{e}_i = \sum_{k=1}^{L} c_{ik}\,\mathbf{f}_k \quad \text{for } i = 1,\dots,L. \qquad (A.2.1)$$

Option (a) demands the EAFs to be orthonormal in space, which leads to the following condition for the corresponding coefficients $c_{jk}$, where we start from the orthonormality condition of the EOFs

$$\mathbf{e}_i\cdot\mathbf{e}_j = \sum_{k,l=1}^{L} c_{ik}\,c_{jl}\,\mathbf{f}_k\cdot\mathbf{f}_l = \sum_{k=1}^{L} c_{ik}\,c_{jk} = \delta_{ij}. \qquad (A.2.2)$$

Additionally, it can be easily shown that

$$\sum_{i=1}^{L} c_{ik}\,c_{il} = \delta_{kl}. \qquad (A.2.3)$$

Using Eq. (A.2.3), it is now straightforward to show from Eq. (A.2.1) that the EAFs can be calculated from the EOFs as

$$\mathbf{f}_k = \sum_{i=1}^{L} c_{ik}\,\mathbf{e}_i \quad \text{for } k = 1,\dots,L. \qquad (A.2.4)$$

Using this definition for the EAFs, the corresponding amplitudes $b_k(t)$ are found by

$$b_k(t) = \mathbf{f}_k\cdot\mathrm{Z500}(t). \qquad (A.2.5)$$
We now discuss option (b).

For option (b), Eqs. (11) and (A.2.5) cannot hold simultaneously. To obtain the coefficients $c_{ik}$ for this option, we start with Eq. (11) and define

$$b_k(t) = \sum_{i=1}^{L} c_{ik}\,a_i(t) = \sum_{i=1}^{L} c_{ik}\,\mathbf{e}_i\cdot\mathrm{Z500}(t) = \mathbf{g}_k\cdot\mathrm{Z500}(t). \qquad (A.2.6)$$

Then the condition that $b_k(t)$ and $b_l(t)$ are uncorrelated in time, i.e.,

$$\langle b_k(t)\,b_l(t)\rangle = \delta_{kl}, \qquad (A.2.7)$$

yields, using the fact that the EOF amplitudes are uncorrelated in time,

$$\sum_{i,j=1}^{L} c_{ik}\,\langle a_i(t)\,a_j(t)\rangle\,c_{jl} = \sum_{i=1}^{L} c_{ik}\,\sigma_i^2\,c_{il} = \delta_{kl}. \qquad (A.2.8)$$

We then define

$$\mathbf{f}_k = \sum_{i=1}^{L} c_{ik}\,\sigma_i^2\,\mathbf{e}_i \quad \text{for } k = 1,\dots,L, \qquad (A.2.9)$$

in terms of which Eq. (A.2.7) can be re-expressed as

$$\mathbf{f}_k\cdot\mathbf{g}_l = \delta_{kl}. \qquad (A.2.10)$$

Note here that for option (a) the patterns are automatically normalized to unity. For option (b), the patterns can be normalized to one, but the normalization of $\mathbf{g}_k$ should be adjusted as well in order for Eq. (A.2.10) to remain valid.

To obtain the rest of the (L - 1) EAFs, the procedure described in Appendix A.1 needs to be repeated L - 1 times, but some care needs to be taken because of the orthonormality condition imposed by the definition of the set of EAFs. When these subtle issues are taken into account, the procedure becomes a repetition of the following three steps.

(i) Construct Z500'(t), the Z500 daily anomaly field that lies within the vector subspace of the first L EOFs but is orthogonal to the first EAF. This is achieved in the following manner.

First define

$$\mathbf{e}'_j = \mathbf{e}_j - (\mathbf{e}_j\cdot\mathbf{f}_1)\,\mathbf{f}_1 = \mathbf{e}_j - c_{j1}\,\mathbf{f}_1 \qquad (A.2.11)$$

for j = 2, . . . , L. The dot product of both sides of Eq. (A.2.11) with Z500(t) then yields

$$a'_j(t) = a_j(t) - c_{j1}\,b_1(t) \quad \text{for } j = 2,\dots,L \qquad (A.2.12)$$

for option (a). For option (b), the corresponding expression is

$$a'_j(t) = a_j(t) - c_{j1}\,\sigma_j^2\,b_1(t) \quad \text{for } j = 2,\dots,L. \qquad (A.2.13)$$

(ii) Calculate the coefficients $c'_{j2}$ for j = 2, . . . , L that maximize $r_2$:

$$r_2\,c'_{j2} = \sum_{i=2}^{L}\left[\langle a'_j(t)\,a'_i(t)\rangle_p\right]^{-1}\langle T_p(t)\,a'_i(t)\rangle_p \qquad (A.2.14)$$

with

$$b_2(t) = \sum_{j=2}^{L} c'_{j2}\,a'_j(t) = \sum_{j=1}^{L} c_{j2}\,a_j(t). \qquad (A.2.15)$$

(iii) Next the coefficients $c_{j2}$ are calculated from the coefficients $c'_{j2}$. For option (a) substitution of Eq. (A.2.12) into Eq. (A.2.15) leads to

$$c_{j2} = c'_{j2} - c_{j1}\sum_{i=1}^{L} c'_{i2}\,c_{i1} \quad \text{for } j = 2,\dots,L, \qquad (A.2.16)$$

with the convention that $c'_{12} = 0$. For option (b) substitution of Eq. (A.2.13) into Eq. (A.2.15) leads to

$$c_{j2} = c'_{j2} - c_{j1}\sum_{i=1}^{L} c'_{i2}\,\sigma_i^2\,c_{i1} \quad \text{for } j = 2,\dots,L, \qquad (A.2.17)$$

with the convention that $c'_{12} = 0$.

These steps are to be repeated until all L coefficient vectors have been determined. For option (a) the EAFs are then determined from Eq. (A.2.4), for option (b) from Eq. (A.2.9).
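
As a rough illustration of the three steps for option (b), the NumPy sketch below builds the EAF amplitudes one at a time. It is not the original implementation: the function name `eafs_uncorrelated` and its parameters are hypothetical, and instead of the explicit coefficient bookkeeping of steps (i) and (iii) it removes from each amplitude series the part correlated with the EAF just found, which is assumed to span the same subspace. The explicit `rcond` cutoff is also an assumption, chosen so that the rank-deficient residual amplitudes are handled safely.

```python
import numpy as np

def eafs_uncorrelated(amps, temp, n_eaf=2, n_power=2):
    """Sequentially build EAF amplitudes that are uncorrelated in time.

    amps: (ntime, L) EOF amplitudes a_j(t) with zero time mean
    temp: (ntime,) local temperature anomalies T(t)
    Returns (coeffs, b, r): coeffs[:, k] holds c_{jk}, b[:, k] is b_k(t)
    scaled to unit variance, r[k] is the optimized r value of EAF k+1.
    """
    ntime, n_eof = amps.shape
    warm = temp > 0
    t_p = temp[warm] ** n_power
    t_p = t_p / np.sqrt(np.mean(t_p ** 2))
    resid = amps.copy()            # a'_j(t): amplitudes with earlier EAFs removed
    to_orig = np.eye(n_eof)        # invariant: resid == amps @ to_orig
    coeffs = np.zeros((n_eof, n_eaf))
    b = np.zeros((ntime, n_eaf))
    r = np.zeros(n_eaf)
    for k in range(n_eaf):
        # Step (ii): least-squares fit on warm days within the residual subspace.
        c_res, *_ = np.linalg.lstsq(resid[warm], t_p, rcond=1e-10)
        c_k = to_orig @ c_res      # step (iii): coefficients w.r.t. the original a_j
        b_k = amps @ c_k
        scale = np.sqrt(np.mean(b_k ** 2))
        c_k, b_k = c_k / scale, b_k / scale          # enforce <b_k^2> = 1
        r[k] = np.mean(b_k[warm] * t_p) / np.sqrt(np.mean(b_k[warm] ** 2))
        coeffs[:, k], b[:, k] = c_k, b_k
        # Step (i) for the next pass: remove the part of each a'_j correlated with b_k.
        d = resid.T @ b_k / ntime                    # <a'_j b_k>
        resid = resid - np.outer(b_k, d)
        to_orig = to_orig - np.outer(c_k, d)
    return coeffs, b, r
```

The returned amplitudes should satisfy the condition of Eq. (A.2.7), and the r values should be non-increasing with k. The spatial patterns themselves would then follow from the coefficients via Eq. (A.2.9).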